618 research outputs found

    Bootstrap Inference for Multiple Imputation under Uncongeniality and Misspecification

    Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin's simple combination rules. These give frequentist-valid inferences when the imputation and analysis procedures are so-called congenial and the complete-data analysis is valid, but otherwise may not. Roughly speaking, congeniality concerns whether the imputation and analysis models make different assumptions about the data. In practice, imputation and analysis procedures are often not congenial, so that tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation, and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas bootstrapping followed by imputation does. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
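For readers unfamiliar with the combination rules mentioned in this abstract, the following is a minimal sketch of Rubin's rules for a scalar estimate: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the mean within-imputation variance plus (1 + 1/M) times the between-imputation variance. The function name `rubin_combine` and its argument names are hypothetical, chosen for illustration only; this is not the bootstrap procedure the paper recommends.

```python
from statistics import mean, variance


def rubin_combine(estimates, within_variances):
    """Pool M per-imputation estimates of a scalar using Rubin's rules.

    estimates        -- point estimate from each imputed dataset
    within_variances -- estimated variance of each point estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)          # average of the M estimates
    within = mean(within_variances)   # average within-imputation variance
    between = variance(estimates)     # sample variance (divisor M - 1)

    # Total variance: within + (1 + 1/M) * between
    total = within + (1 + 1 / m) * between
    return pooled, total
```

As the abstract notes, the variance produced this way is only guaranteed to be valid under congeniality and a correctly specified complete-data analysis.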

    Estimation of the linear mixed integrated Ornstein-Uhlenbeck model.

    The linear mixed model with an added integrated Ornstein-Uhlenbeck (IOU) process (linear mixed IOU model) allows for serial correlation and estimation of the degree of derivative tracking. It is rarely used, partly due to the lack of available software. We implemented the linear mixed IOU model in Stata and, using simulations, assessed the feasibility of fitting the model by restricted maximum likelihood when applied to balanced and unbalanced data. We compared different (1) optimization algorithms, (2) parameterizations of the IOU process, (3) data structures and (4) random-effects structures. Fitting the model was practical and feasible when applied to large and moderately sized balanced datasets (20,000 and 500 observations), and to large unbalanced datasets with (non-informative) dropout and intermittent missingness. Analysis of a real dataset showed that the linear mixed IOU model was a better fit to the data than the standard linear mixed model (i.e. independent within-subject errors with constant variance).

    Joint modelling rationale for chained equations.

    BACKGROUND: Chained equations imputation is widely used in medical research. It uses a set of conditional models, so is more flexible than joint modelling imputation for the imputation of different types of variables (e.g. binary, ordinal or unordered categorical). However, chained equations imputation does not correspond to drawing from a joint distribution when the conditional models are incompatible. Concurrently with our work, other authors have shown the equivalence of the two imputation methods in finite samples. METHODS: Taking a different approach, we prove, in finite samples, sufficient conditions for chained equations and joint modelling to yield imputations from the same predictive distribution. Further, we apply this proof in four specific cases and conduct a simulation study which explores the consequences when the conditional models are compatible but the conditions otherwise are not satisfied. RESULTS: We provide an additional "non-informative margins" condition which, together with compatibility, is sufficient. We show that the non-informative margins condition is not satisfied, despite compatible conditional models, in a situation as simple as two continuous variables and one binary variable. Our simulation study demonstrates that, as a consequence of this violation, order effects can occur; that is, systematic differences depending upon the ordering of the variables in the chained equations algorithm. However, the order effects appear to be small, especially when associations between variables are weak. CONCLUSIONS: Since chained equations is typically used in medical research for datasets with different types of variables, researchers must be aware that order effects are likely to be ubiquitous, but our results suggest they may be small enough to be negligible.

    Association of invasion-promoting tenascin-C additional domains with breast cancers in young women

    Introduction: Tenascin-C (TNC) is a large extracellular matrix glycoprotein that shows prominent stromal expression in many solid tumours. The profile of isoforms expressed differs between cancers and normal breast, with the two additional domains AD1 and AD2 considered to be tumour associated. The aim of the present study was to investigate expression of AD1 and AD2 in normal, benign and malignant breast tissue to determine their relationship with tumour characteristics and to perform in vitro functional assays to investigate the role of AD1 in tumour cell invasion and growth. Methods: Expression of AD1 and AD2 was related to hypoxanthine phosphoribosyltransferase 1 as a housekeeping gene in breast tissue using quantitative RT-PCR, and the results were related to clinicopathological features of the tumours. Constructs overexpressing an AD1-containing isoform (TNC-14/AD1/16) were transiently transfected into breast carcinoma cell lines (MCF-7, T-47D, ZR-75-1, MDA-MB-231 and GI-101) to assess the effect in vitro on invasion and growth. Statistical analysis was performed using a nonparametric Mann-Whitney test for comparison of clinicopathological features with levels of TNC expression and using Jonckheere-Terpstra trend analysis for association of expression with tumour grade. Results: Quantitative RT-PCR detected AD1 and AD2 mRNA expression in 34.9% and 23.1% of 134 invasive breast carcinomas, respectively. AD1 mRNA was localised by in situ hybridisation to tumour epithelial cells, and more predominantly to myoepithelium around associated normal breast ducts. Although not tumour specific, AD1 and AD2 expression was significantly more frequent in carcinomas in younger women (age ≤40 years; P < 0.001) and AD1 expression was also associated with oestrogen receptor-negative and grade 3 tumours (P < 0.05). AD1 was found to be incorporated into a tumour-specific isoform, not detected in normal tissues. Overexpression of the TNC-14/AD1/16 isoform significantly enhanced tumour cell invasion (P < 0.01) and growth (P < 0.01) over base levels. Conclusions: Together these data suggest a highly significant association between AD-containing TNC isoforms and breast cancers in younger women (age ≤40 years), which may have important functional significance in vivo.

    Perceptions of European ME/CFS experts concerning knowledge and understanding of ME/CFS among primary care physicians in Europe : A report from the European ME/CFS research network (EUROMENE)

    Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. Background and Objectives: We have conducted a survey of academic and clinical experts who are participants in the European ME/CFS Research Network (EUROMENE) to elicit perceptions of general practitioner (GP) knowledge and understanding of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and suggestions as to how this could be improved. Materials and Methods: A questionnaire was sent to all national representatives and members of the EUROMENE Core Group and Management Committee. Survey responses were collated and then summarized based on the numbers and percentages of respondents selecting each response option, while weighted average responses were calculated for questions with numerical value response options. Free text responses were analysed using thematic analysis. Results: Overall there were 23 responses to the survey from participants across 19 different European countries, with a 95% country-level response rate. Serious concerns were expressed about GPs’ knowledge and understanding of ME/CFS, and it was felt that about 60% of patients with ME/CFS went undiagnosed as a result. The vast majority of GPs were perceived to lack confidence in either diagnosing or managing the condition. Disbelief and misleading illness attributions were perceived to be widespread, and the unavailability of specialist centres to which GPs could refer patients and seek advice and support was frequently commented upon. There was widespread support for more training on ME/CFS at both undergraduate and postgraduate levels. Conclusion: The results of this survey are consistent with the existing scientific literature. ME/CFS experts report that lack of knowledge and understanding of ME/CFS among GPs is a major cause of missed and delayed diagnoses, which renders problematic attempts to determine the incidence and prevalence of the disease, and to measure its economic impact. It also contributes to the burden of disease through mismanagement in its early stages. Publisher's version. Peer reviewed.

    Using linear and natural cubic splines, SITAR, and latent trajectory models to characterise nonlinear longitudinal growth trajectories in cohort studies

    BACKGROUND: Longitudinal data analysis can improve our understanding of the influences on health trajectories across the life-course. There are a variety of statistical models which can be used, and their fitting and interpretation can be complex, particularly where there is a nonlinear trajectory. Our aim was to provide an accessible guide, along with applied examples, to using four sophisticated modelling procedures for describing nonlinear growth trajectories. METHODS: This expository paper provides an illustrative guide to summarising nonlinear growth trajectories for repeatedly measured continuous outcomes using (i) linear spline and (ii) natural cubic spline linear mixed-effects (LME) models, (iii) Super Imposition by Translation and Rotation (SITAR) nonlinear mixed effects models, and (iv) latent trajectory models. The underlying model for each approach, their similarities and differences, and their advantages and disadvantages are described. Their application and the correct interpretation of their results are illustrated by analysing repeated bone mass measures to characterise bone growth patterns and their sex differences in three cohort studies from the UK, USA, and Canada comprising 8500 individuals and 37,000 measurements from ages 5-40 years. Recommendations for choosing a modelling approach are provided along with a discussion and signposting on further modelling extensions for analysing trajectory exposures and outcomes, and multiple cohorts. RESULTS: Linear and natural cubic spline LME models and SITAR provided similar summaries of the mean bone growth trajectory and growth velocity, and the sex differences in growth patterns. Growth velocity (in grams/year) peaked during adolescence, and peaked earlier in females than males; e.g., mean age at peak bone mineral content accrual from multicohort SITAR models was 12.2 years in females and 13.9 years in males. Latent trajectory models (with trajectory shapes estimated using a natural cubic spline) identified up to four subgroups of individuals with distinct trajectories throughout adolescence. CONCLUSIONS: LME models with linear and natural cubic splines, SITAR, and latent trajectory models are useful for describing nonlinear growth trajectories, and these methods can be adapted for other complex traits. Choice of method depends on the research aims, complexity of the trajectory, and available data. Scripts and synthetic datasets are provided for readers to replicate trajectory modelling and visualisation using the R statistical computing software.
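As a rough illustration of the simplest of the four approaches named in this abstract, the sketch below builds a linear spline (truncated-line) basis for age: one column for age itself plus one column per knot holding max(age - knot, 0). Entered as fixed effects in a mixed model, the coefficient on each knot column is the change in slope at that knot. The function name and the knot ages used in the usage note are hypothetical and are not taken from the paper, whose own scripts are in R.

```python
def linear_spline_basis(ages, knots):
    """Return one row per age: [age, (age - k1)+, (age - k2)+, ...].

    (x)+ denotes max(x, 0), so each knot column is zero before its knot
    and rises linearly after it, which lets the fitted slope change there.
    """
    ordered_knots = sorted(knots)
    return [
        [age] + [max(age - knot, 0) for knot in ordered_knots]
        for age in ages
    ]
```

For example, `linear_spline_basis([5, 12, 20], [10, 15])` produces three rows with three columns each, with the knot columns remaining at zero for the age below the first knot.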